# Ultra-low-voltage GDI-based hybrid full adder design for area and energy-efficient computing systems

ISSN 1751-858X
Received on 30th April 2018, Revised 1st November 2018, Accepted on 6th December 2018, E-First on 31st May 2019
doi: 10.1049/iet-cds.2018.5559
www.ietdl.org

<sup>1</sup>School of Electronics Engineering, VIT University, Vellore-632014, India
\*Current affiliation: Department of ECE, Presidency University, Bangalore-64, India
⊠ E-mail: dhillsakthi@gmail.com

**Abstract:** In recent years, ultra-low-voltage (ULV) operation has been gaining importance as a means of achieving minimum energy consumption. The full adder is the basic computational arithmetic block in many computing and signal/image processing applications. Here, a new hybrid 1-bit full adder circuit that employs both Gate Diffusion Input (GDI) logic and multi-threshold voltage (MVT) transistor logic is reported. The main objective of the proposed MVT-GDI-based hybrid full adder design is to provide minimum energy consumption with less area. The proposed hybrid design is simulated using a standard 45 nm CMOS process technology at a ULV of 0.2 V. The post-layout simulation results show that the proposed design achieves significant improvements over the other reported designs, with savings of >57% in energy and 92% in energy-delay product (EDP), using only 14 transistors. Monte-Carlo simulations have also been performed, and it is found that the proposed design methodology yields full functionality and robustness against local and global process variations. Energy metrics normalised to 32 and 22 nm technologies show that the proposed design achieves >57% energy savings in comparison with the recent works.

## 1 Introduction

With the advancements in technology and the increasing use of laptops, cellular phones, iPads, Internet of Things (IoT) devices, and other portable communication systems, many applications require high speed, small area, and low power consumption.
There is therefore a need for circuits with low energy consumption in the design of system components and application-specific processors [1]. This demand makes the implementation of digital systems a challenging topic of interest for circuit design engineers. One of the most efficient ways of achieving minimum energy consumption is to operate digital circuits at ULV, i.e. at the near-threshold/sub-threshold voltage of the transistor [2].

Addition, subtraction, multiplication, and accumulation are the most common and extensively used arithmetic operations in many very large-scale integration (VLSI) and digital signal processing (DSP) architectures. The efficient implementation of these arithmetic operations in executing standard algorithms such as convolution, digital filtering, and correlation leads to high-performance DSPs and application-specific processors [3]. The basic building block used for implementing these arithmetic operations is the 1-bit full adder cell. Enhancing the performance of the full adder cell is therefore essential to enhancing the overall system/architecture performance.

Many full adder designs employing different logic styles and technologies have been reported in the literature [4-19]. Some designs are based on a single logic style, while others use multiple logic styles (hybrid designs). Although the functionality of every full adder design is the same, each has its own merits and demerits in terms of performance parameters such as area, speed, and power consumption.

### 1.1 Brief review on existing full adder circuits

The static complementary metal-oxide-semiconductor (C-CMOS) full adder design is the most conventional approach [4]. The design consists of 28 transistors, with PMOS pull-up transistors and NMOS pull-down transistors arranged as in a regular CMOS structure. The main advantage of this structure is its robustness against supply voltage scaling and transistor sizing. It also provides full swing logic, which is essential in designing complex structures.
The drawbacks of this structure are its high input capacitance and larger area, both caused by the large PMOS transistors it employs. The mirror adder is a smart design that is similar to the static CMOS full adder in power consumption and transistor count but has a smaller carry propagation delay than the C-CMOS [5]. The Complementary Pass Transistor Logic (CPL) full adder is another conventional design, employing 32 transistors [8, 9]. In pass transistor logic, the source of the pass transistor is connected to input signals instead of to the supply lines as in CMOS. This adder logic provides good voltage swing restoration but is not an appropriate choice for low power applications because of its many intermediate switching nodes and higher transistor count.

To address the voltage degradation problem in pass transistor logic, transmission gate logic full adders were proposed [10, 19]. The Transmission Function Full Adder (TFA), employing 16 transistors, was proposed by Alioto *et al.* [10], and another Transmission Gate Full Adder (TGA), employing 20 transistors, was proposed by Shams *et al.* [19]. These adders are based on transmission function theory and transmission gates. The transmission gate structure is formed by the parallel connection of a PMOS pass transistor and an NMOS pass transistor. The main advantage of these transmission gate logic structures is low power consumption, but this logic is not preferable when TGAs or TFAs are cascaded in complex structures because of its poor driving capability.

Later, many hybrid full adders were published to reduce the area, delay, and power [11–18]. Vesterbacka [14] proposed a 14-transistor (14T) hybrid adder, but it suffers from pass transistors that do not provide full swing. Another similar hybrid adder, which employs only 10 transistors, was proposed by Bui *et al.* [13]. Both the 10-transistor (10T) and 14T adders suffer from poor driving capability.
Zhang *et al.* [11] proposed a Hybrid Pass logic with Static CMOS output drive full adder (HPSC). It uses a six-transistor pass logic network to generate the XOR and XNOR functions simultaneously. The HPSC full adder produces full swing logic, but at the cost of more delay and a higher transistor count. Another adder that uses a hybrid logic style is the majority-based adder [12]. It employs only capacitors and static CMOS inverters to generate majority functions and consumes less power because of its low transistor count.

Fig. 1 Structure of a basic gate diffusion input (GDI) cell with inputs G, P, and N. (a) Originally proposed [22], (b) CMOS compatible [23]

Tung *et al.* [16] presented a 24-transistor (24T) full adder based on a 3-input XOR design, which also uses two different logic styles: CMOS and pass transistor logic. Another CPL-based Hybrid Full Adder (FA-Hybrid) was proposed by Goel *et al.* [15]; it uses a novel XOR/XNOR design employing NMOS transistors and cross-coupled PMOS transistors to improve speed. Because static CMOS inverters are used at the output, it provides better driving capability but suffers from higher power consumption. Aguirre *et al.* [17] proposed two hybrid designs: the Swing Restored Pass transistor Logic Full Adder (SRPL-FA) and the Double Pass transistor Logic Full Adder (DPL-FA). These full adders are designed using groundless/powerless pass transistors for energy-efficient computation. To obtain full swing logic, PMOS restoration transistors are used in the SRPL-FA, whereas complementary transistors are used in the DPL-FA. Another hybrid 16-transistor full adder (16T Hybrid) was proposed by Bhattacharyya *et al.* [18]. This full adder uses weak inverters in the sum generation module and strong transmission gates in the carry module to reduce the power-delay product (PDP). Another full adder design makes use of full swing GDI-based AND, OR, and XOR gates to provide full swing outputs [20].
This design requires more transistors because swing restoration transistors are used for all the individual gates. Recently, a new full adder circuit (10T) comprising only 10 transistors was proposed for ultra-low power (ULP) applications [21]. This design falls short in energy efficiency and full logic swing if buffers are not used.

Most of the hybrid full adder designs improve one of the performance parameters (power, speed, or area) at the expense of the others. As none of the full adder designs in the literature provides robust operation with full logic swing together with an area- and energy-efficient solution at ULV, there is a need to explore new design methodologies. This paper presents a new energy-efficient full adder cell designed using a combination of multi-threshold voltage transistors and the GDI technique.

The remainder of this paper is organised as follows. Section 2 gives a brief overview of the GDI approach. Section 3 presents the design approach of the proposed full adder circuit. Simulation results and performance comparisons of the proposed design with different existing ones are discussed in Section 4. The performance of the 32-bit carry propagation adder is presented in Section 5, followed by the conclusions in Section 6.

## 2 Overview of GDI approach

This section gives a brief overview of one of the popular digital logic techniques of recent times, gate-diffusion input (GDI) [22]. A number of complex logic functions can be realised with only two transistors using the GDI technique. GDI logic depends on the use of a simple cell, as shown in Fig. 1. The structure of the cell resembles the static CMOS inverter, but there are some key differences to note.

- The GDI cell has three inputs: G, the common gate input of both the PMOS and NMOS; N, the input to the source/drain of the NMOS; and P, the input to the source/drain of the PMOS.
- The body terminals of the NMOS and PMOS are arbitrarily biased in GDI by connecting them to the inputs N and P, respectively.

The GDI methodology was originally introduced for fabrication in silicon-on-insulator (SOI) and twin-well CMOS processes [22]. Later, a GDI cell compatible with standard CMOS was introduced, as shown in Fig. 1b [23]. It was shown that logic functions such as AND, OR, XOR, and MUX, which require 6–12 transistors in conventional static CMOS and transmission gate logic, can be implemented with far fewer transistors (two for most functions, see Table 2) using the GDI cell simply by changing the inputs. Table 1 shows the logic table for implementing various Boolean functions using GDI, and Table 2 compares the transistor counts of the GDI and conventional CMOS implementations of different Boolean functions. F1 and F2 are the two universal logic functions offered by GDI, which can be used to realise other complex functions more efficiently than the universal NAND and NOR logic gates.

## 3 Design approach and operation of the proposed full adder circuit

This section presents the design of a new energy-efficient hybrid full adder implemented using the MVT-GDI approach. As the circuit is operated at ULV, the transistors will be in the sub-threshold/weak-inversion region, and the sub-threshold current of a MOS device is given by (1) [24]

$$I_{\text{Sub}} = I_0 e^{\left(V_{\text{gs}} - V_T\right)/nV_{\text{th}}} \tag{1}$$

where $I_0$ is the drain current when $V_{\text{gs}} = V_T$ and is given by

$$I_0 = \mu_0 C_{\text{ox}} \frac{W}{L} (n-1) V_{\text{th}}^2 \tag{2}$$

Here, $V_T$ is the threshold voltage, $V_{\text{gs}}$ is the gate-to-source voltage, $n$ is the sub-threshold slope factor $(n = 1 + (C_d/C_{\text{ox}}))$, and $V_{\text{th}}$ is the thermal voltage $(kT/q)$ of the transistor.
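As a rough numerical illustration of (1) and (2), the sketch below evaluates the sub-threshold current and shows how strongly it reacts to a change in $V_T$. The function names and every device value used with them are illustrative assumptions rather than parameters of the 45 nm process used in this work. Only the 27°C operating temperature is taken from the paper.

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temp_kelvin=300.15):
    """Thermal voltage V_th = kT/q (300.15 K corresponds to 27 degC)."""
    return BOLTZMANN * temp_kelvin / ELECTRON_CHARGE


def drive_current_prefactor(mu_0, c_ox, width, length, n, temp_kelvin=300.15):
    """I_0 from (2): mu_0 * C_ox * (W/L) * (n - 1) * V_th^2."""
    v_th = thermal_voltage(temp_kelvin)
    return mu_0 * c_ox * (width / length) * (n - 1) * v_th ** 2


def subthreshold_current(v_gs, v_t, i_0, n, temp_kelvin=300.15):
    """I_Sub from (1): I_0 * exp((V_gs - V_T) / (n * V_th))."""
    v_th = thermal_voltage(temp_kelvin)
    return i_0 * math.exp((v_gs - v_t) / (n * v_th))


if __name__ == "__main__":
    # Hypothetical values chosen only to show the trend, not process data.
    i_0 = 1e-7        # A
    n = 1.5           # sub-threshold slope factor
    v_gs = 0.2        # V, equal to the ULV supply used in the paper
    low_vt = subthreshold_current(v_gs, 0.25, i_0, n)
    std_vt = subthreshold_current(v_gs, 0.35, i_0, n)
    print(f"current gain from a 0.1 V lower V_T: {low_vt / std_vt:.1f}x")
```

Because the dependence on $V_T$ is exponential, a modest reduction in threshold voltage multiplies the available drive current, which is the motivation for placing low-$V_T$ devices in the threshold-drop paths.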
From (1), it can be understood that the performance of sub-threshold CMOS logic circuits degrades significantly because of the exponential increase in delay. Although the sub-threshold and gate leakage components of a GDI cell were shown to be significantly lower than those of a static CMOS gate [22], the performance of GDI circuits is still significantly affected by the poor logic swing caused by the $V_T$ drop. To reduce this impact, the transistors in the critical path that exhibit a threshold drop are replaced with low-$V_T$ transistors in the proposed design. Using high-$V_T$ devices in non-critical paths can reduce the power consumption, but at ULVs it may lead to functional failure of the design. The authors therefore did not use high-$V_T$ cells in the proposed designs, in order to improve the performance, which is critical in sub-threshold operation for minimising the energy consumption.

Fig. 2 Power and delay variations of the proposed full adder design at different process corners. (a) Power, (b) Delay

Transistor sizing also plays a key role in determining the performance of the design. Initially, the transistors were sized based on the theoretical background of full adder circuit design and the CMOS technology [4]. Subsequently, the sizes were varied in the vicinity of the previously set values to obtain the best performance in terms of energy consumption through simulations. The optimised transistor sizes of the proposed full adder design are summarised in Table 3. The functionality and the structure of the proposed adder design are explained as follows.
### 3.1 Proposed hybrid full adder design

In general, the logic functions of a basic 1-bit full adder can be represented as in (3) and (4)

$$\text{Sum} = (A \oplus B) \oplus C_{\text{in}} \tag{3}$$

$$C_{\text{out}} = (A \cdot B) + C_{\text{in}} \cdot (A \oplus B) \tag{4}$$

The proposed full adder design employs only 14 transistors, as shown in Fig. 3. It consists of five logic blocks designed using the MVT-GDI technique: one XOR/XNOR block, two multiplexers, one Swing Restored Transmission Gate (SRTG) block, and one Swing Restored Pass Transistor (SRPT) block. The XOR/XNOR block is designed using the GDI technique. Since the path of the inverters used in the XOR/XNOR block has no voltage drop, they are implemented with standard-$V_T$ devices. The GDI MUX-1 multiplexes the outputs of the XOR ($A \oplus B$) and the XNOR ($A \odot B$) with the control input $C_{\text{in}}$ to obtain the sum function. Therefore, (3) can also be represented as in (5)

$$\text{Sum} = \overline{C_{\text{in}}}(A \oplus B) + C_{\text{in}}(A \odot B) \tag{5}$$

The carry output ($C_{\text{out}}$) is generated by the GDI MUX-2, which multiplexes the inputs $C_{\text{in}}$ and $B$ with the control line driven by the output of the XNOR logic ($A \odot B$). Therefore, (4) can also be represented as in (6)

$$C_{\text{out}} = (\overline{A \odot B})C_{\text{in}} + (A \odot B)B \tag{6}$$

**Table 1** Implementation of various Boolean functions using GDI cell

| N | P | G | Out | Function |
|-----|-----|---|----------------------|----------|
| '0' | B | A | $\overline{A}B$ | F1 |
| B | '1' | A | $\overline{A} + B$ | F2 |
| '0' | '1' | A | $\overline{A}$ | NOT |
| B | '0' | A | $AB$ | AND |
| '1' | B | A | $A + B$ | OR |
| C | B | A | $\overline{A}B + AC$ | MUX |

**Table 2** Transistor count comparison (number of transistors required)

| Function | CMOS | GDI |
|----------|------|-----|
| F1 | 6 | 2 |
| F2 | 6 | 2 |
| NOT | 2 | 2 |
| AND | 6 | 2 |
| OR | 6 | 2 |
| XOR | 12 | 4 |
| MUX | 12 | 2 |

**Table 3** Transistor sizes of the proposed design

| Transistor name | Width, nm | Length, nm |
|------------------------|-----------|------------|
| M1, M5 | 240 | 45 |
| M2, M6 | 120 | 45 |
| M3, M7 | 360 | 45 |
| M4, M8 | 120 | 45 |
| M11 | 480 | 45 |
| M9, M10, M12, M13, M14 | 120 | 45 |

Fig. 3 Proposed 14T MVT-GDI hybrid full adder design

Although the proposed structure looks similar to many previous XOR/XNOR logic-based designs and to the authors' previous GDI-based design [25], none of the previous designs provides full logic swing with only 14 transistors. In the proposed design, full swing is ensured using an SRTG at the sum output and SRPTs at the carry output ($C_{\text{out}}$). The functionality of this full adder with respect to the states of the transistors is shown in Table 4. It can be observed that the swing restoration transistors (M11, M12, M13, M14) are 'ON' when there is a $V_T$ drop at the output of the sum generation GDI MUX-1 and the $C_{\text{out}}$ generation GDI MUX-2, so as to provide full swing logic. Since there is no $V_T$ drop at the output in most of the cases, as stated in Table 4, the transistors (M11, M12, M13, M14) are also implemented with standard-$V_T$ devices.

## 4 Results and performance comparisons

This section presents the simulation results and the performance analysis of the proposed full adder. The simulations are performed using Cadence tools in a 45 nm CMOS technology at a ULV of 0.2 V. The performance parameters power, delay, energy, Energy Delay Product (EDP), and layout area obtained from the simulations are compared with those of the other designs reported in the literature.
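The energy and EDP metrics named above can be related to power and delay with a short calculation. The sketch below assumes that energy is the product of average power and delay and that EDP is energy times delay. This is an assumption on our part, although it appears consistent with the figures reported in Table 5, whose power and delay columns are copied here. The function names are illustrative.

```python
# (average power in pW, delay in microseconds), copied from Table 5 of this paper
TABLE_5 = {
    "C-CMOS": (2.568, 1.55),
    "mirror": (2.523, 1.534),
    "CPL": (6.236, 1.44),
    "TFA": (2.98, 1.285),
    "TGA": (3.277, 1.213),
    "16T-hybrid": (2.506, 0.978),
    "10T": (2.34, 5.95),
    "GDI": (1.665, 3.3),
    "proposed": (3.053, 0.344),
}


def energy_aj(power_pw, delay_us):
    """Energy in aJ, assuming E = P * t (pW * us = 1e-18 J = 1 aJ)."""
    return power_pw * delay_us


def edp_yjs(power_pw, delay_us):
    """Energy-delay product in yJ*s (aJ * us = 1e-24 J*s)."""
    return energy_aj(power_pw, delay_us) * delay_us


def saving_percent(proposed_value, other_value):
    """Percentage by which the proposed value is lower than the other value."""
    return 100.0 * (1.0 - proposed_value / other_value)


def savings_of_proposed(metric):
    """Saving of the proposed design against every other design for one metric."""
    proposed_value = metric(*TABLE_5["proposed"])
    return {
        name: saving_percent(proposed_value, metric(*values))
        for name, values in TABLE_5.items()
        if name != "proposed"
    }


if __name__ == "__main__":
    for name, saving in savings_of_proposed(energy_aj).items():
        print(f"{name}: {saving:.1f}% energy saving")
```

Under this assumption, the smallest energy saving of the proposed design is the one against the 16T-hybrid adder, which is the comparison that sets the >57% figure quoted in the paper.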
The simulation results of the full adders, together with their transistor count and layout area, are shown in Table 5. To maintain uniformity in the comparisons, the full adder circuits presented in Table 5 are operated at a frequency of 20 kHz and a temperature of 27°C. To obtain fair results, no additional buffers are used for any of the full adder structures. The proposed full adder provides significant improvements overall, and the performance comparison of the full adders with respect to each parameter is discussed below.

From Table 5, it can be observed that the power consumption of the CPL design is the highest because of its large transistor count, which results in more switching activity. This power consumption is caused by the charging and discharging of the load capacitances and can be expressed as in (7)

$$P = \alpha C_L V_{\rm DD}^2 f \tag{7}$$

where $\alpha$ is the switching activity factor, $C_L$ is the load capacitance, $V_{\rm DD}$ is the supply voltage, and $f$ is the operating frequency of the circuit. The GDI design consumes the least power because of its low transistor count. Although the 10T design and the GDI design have the same transistor count, the GDI design consumes less power because of its lower leakage current [22]. The proposed design lags in terms of power consumption because of the additional swing-restoring gates and low-$V_T$ devices, which increase the switching activity and the leakage components, respectively. This is not an issue in terms of energy consumption because of its low delay, which is a critical parameter when operating at ULV.

The delay metric plays a key role in estimating the energy metric of the circuit at ULV, since the delay increases exponentially with supply voltage scaling according to (8) [2]. For strong inversion operation, the delay is instead expressed as in (9).
$$t_{d\text{sub}} = \frac{C_L V_{\text{DD}}}{I_0 e^{(V_{\text{DD}} - V_T)/n V_{\text{th}}}} \tag{8}$$

$$t_d = \frac{C_L V_{\rm DD}}{\left(V_{\rm DD} - V_T\right)^{\alpha}} \tag{9}$$

As expected, in strong inversion operation the delay does not depend strongly on $V_{\rm DD}$, whereas in the sub-threshold region of operation the delay is exponentially higher because of the exponentially decreasing ON-current ($I_{\rm on}$). It is clear from (8) that there is an exponential roll-off in speed when the circuit operates at ULV, limiting the range of applications to low or medium frequencies [2]. The delay is measured for all the input transitions as the time difference between the 50% point of the input voltage swing and the 50% point of the output voltage swing. As the carry signal ($C_{\rm in}$) in the proposed design propagates through only a single GDI MUX, the carry propagation path is shortened, leading to a significant improvement in speed. It can be observed that the delay of the proposed design is significantly lower, with >64% savings in comparison with the other designs in the literature. This is achieved by employing additional swing-restoring gates and low-$V_T$ devices, which improve the output driving capability. The delay of the 10T design is higher because of its low driving strength at lower supply voltages.

The energy and the Energy Delay Product (EDP) are two important performance metrics that measure the efficiency of a circuit for digital computational systems. From Table 5, it is clear that the proposed design has the best energy and EDP metrics of the designs compared. The energy and EDP percentage savings of the proposed design with respect to the other designs are shown in Fig. 4.

The area of the proposed full adder circuit is obtained from the layout designed in 45 nm technology, as shown in Fig. 5.
The CPL design occupies more area because of its higher transistor count, and the complexity of its layout is high because of the many metal lines needed to realise the logic. The C-CMOS and mirror adder designs, in spite of their higher transistor counts, occupy almost the same area as the TGA design because of their simple and regular structures. The proposed design occupies reasonably less area than the other designs, excluding the 10T and GDI designs.

**Table 4** Function table of the proposed 14T MVT-GDI hybrid full adder design

| A | B | $C_{\rm in}$ | M1 | M2 | M3 | M4 | A⊕B | M5 | M6 | A⊙B | M7 | M8 | M9 | M10 | M11 | M12 | M13 | M14 | Sum | $C_{\rm out}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | ON | OFF | ON | OFF | $V_T$ (M3) | ON | OFF | 1 | ON | OFF | OFF | ON | ON | ON | OFF | ON | 0 | 0 |
| 0 | 0 | 1 | ON | OFF | ON | OFF | $V_T$ (M3) | ON | OFF | 1 | OFF | ON | OFF | ON | ON | ON | OFF | ON | 1 | 0 |
| 0 | 1 | 0 | OFF | ON | ON | OFF | 1 | OFF | ON | 0 | ON | OFF | ON | OFF | OFF | OFF | ON | OFF | 1 | 0 |
| 0 | 1 | 1 | OFF | ON | ON | OFF | 1 | OFF | ON | 0 | OFF | ON | ON | OFF | OFF | OFF | ON | OFF | 0 | 1 |
| 1 | 0 | 0 | ON | OFF | OFF | ON | $V_{\rm DD} - V_T$ (M4) | OFF | ON | 0 | ON | OFF | ON | OFF | OFF | OFF | ON | OFF | $V_{\rm DD} - V_T$ (M4) | 0 |
| 1 | 0 | 1 | ON | OFF | OFF | ON | $V_{\rm DD} - V_T$ (M4) | OFF | ON | 0 | OFF | ON | ON | OFF | OFF | OFF | ON | OFF | 0 | 1 |
| 1 | 1 | 0 | OFF | ON | OFF | ON | 0 | ON | OFF | 1 | ON | OFF | OFF | ON | ON | ON | OFF | ON | 0 | 1 |
| 1 | 1 | 1 | OFF | ON | OFF | ON | 0 | ON | OFF | 1 | OFF | ON | OFF | ON | ON | ON | OFF | ON | 1 | 1 |

**Table 5** Comparison of the simulation results for different full adder cells in 45 nm technology at 0.2 V supply voltage

| Designs | Average power, pW | Delay, µs | Energy, aJ | EDP, yJs | Transistor count | Area, µm² | Reference |
|------------|-------|-------|--------|--------|----|--------|-----------|
| C-CMOS | 2.568 | 1.55 | 3.98 | 6.169 | 28 | 8.13 | [4, 6] |
| mirror | 2.523 | 1.534 | 3.87 | 5.937 | 28 | 8.536 | [5] |
| CPL | 6.236 | 1.44 | 8.979 | 12.93 | 32 | 10.462 | [8, 9] |
| TFA | 2.98 | 1.285 | 3.829 | 4.92 | 16 | 6.66 | [10] |
| TGA | 3.277 | 1.213 | 3.975 | 4.821 | 20 | 8.18 | [19] |
| 16T-hybrid | 2.506 | 0.978 | 2.451 | 2.397 | 16 | 7.27 | [18] |
| 10T | 2.34 | 5.95 | 13.923 | 82.84 | 10 | 5.12 | [21] |
| GDI | 1.665 | 3.3 | 5.494 | 18.131 | 10 | 4.93 | [26] |
| proposed | 3.053 | 0.344 | 1.05 | 0.361 | 14 | 6.32 | [present] |

Fig. 4 Energy and EDP % savings of the proposed full adder design

Fig. 5 Layout design of the proposed 14T MVT-GDI hybrid full adder circuit

Fig. 6 Delay distribution of full adders derived by Monte-Carlo simulations. (a) C-CMOS, (b) Proposed 14T MVT-GDI

To evaluate the effect of local and global process variations on the delay of the proposed full adder design, Monte-Carlo simulations have been performed. As the C-CMOS design is the most robust against process variations, the results of the proposed design are compared with those of the C-CMOS to evaluate its robustness. The results obtained are shown in Fig. 6. The proposed design shows comparatively less sensitivity to process variations, with a standard deviation ($\sigma$) of only 168.08 ns (83% less than the C-CMOS) and a mean ($\mu$) of 0.336 ns (73% less than the C-CMOS).

Fig. 7 Technology scaling trends of V<sub>DD</sub> and energy metric [30]

The proposed design also provides stable functionality at the different process corners: TT (Typical PMOS, Typical NMOS), FF (Fast PMOS, Fast NMOS), FS (Fast PMOS, Slow NMOS), SF (Slow PMOS, Fast NMOS), and SS (Slow PMOS, Slow NMOS).
The variations in the power consumption and delay of the proposed design are shown in Fig. 2. As expected, the power consumption is maximum at the FF corner and minimum at the SS corner, whereas the delay is maximum at the SS corner and minimum at the FF corner.

To estimate the performance of the proposed design under the technology scaling trends of the International Technology Roadmap for Semiconductors (ITRS) [27, 28] and the International Roadmap for Devices and Systems (IRDS) [29], the energy metric of the proposed design in 45 nm technology is normalised to 32 and 22 nm and compared with the recently proposed CMOS hybrid full adder designs [18, 21]. The normalised energy ($E_N$) metric is calculated from the technology scaling trends of the energy metric and $V_{\rm DD}$ shown in Fig. 7 [30, 31]. The normalised energy metrics (from 45 to 32 and 22 nm) of the proposed design and of the recent works are compared in Table 6. It can be observed that the proposed design consumes significantly less energy and achieves >57% energy savings in comparison with the recent works.

## 5 Performance of 32-bit carry propagation adder

To evaluate the performance of the proposed GDI-based hybrid design in practical applications, the authors have cascaded the proposed 1-bit full adder to form a 32-bit carry propagation adder structure, as shown in Fig. 8. In this structure, the carry propagates from the first to the last adder block. To ensure good driving capability of the cascaded design, the authors have inserted buffers at appropriate stages. Assuming that the maximum allowed voltage drop of the complete cascaded design is 0.2 $V_{\rm DD}$, the number of GDI cells that can be linked between two buffers is given by (10) [22]. As the proposed design is based on the GDI approach, the value of $N$ was evaluated from (10) with $V_{\text{drop}} = V_T$, which turns out to be ~2.
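The cascade described above can be sketched behaviourally as follows. The model uses the multiplexer forms (5) and (6) for each 1-bit stage and passes the carry from one stage to the next. It is a purely logical illustration: the function names are hypothetical, and no electrical effect such as the $V_T$ drop, swing restoration, or buffer insertion is modelled.

```python
def full_adder_mux(a, b, c_in):
    """1-bit full adder in the MUX form of (5) and (6); inputs are 0 or 1."""
    xor_ab = a ^ b
    xnor_ab = 1 - xor_ab
    # (5): C_in selects between the XOR and the XNOR of A and B.
    total = xnor_ab if c_in else xor_ab
    # (6): the XNOR output selects between B and C_in.
    c_out = b if xnor_ab else c_in
    return total, c_out


def ripple_carry_add(x, y, c_in=0, width=32):
    """Add two unsigned integers by chaining `width` 1-bit stages."""
    result = 0
    carry = c_in
    for bit in range(width):
        a = (x >> bit) & 1
        b = (y >> bit) & 1
        total, carry = full_adder_mux(a, b, carry)
        result |= total << bit
    return result, carry


if __name__ == "__main__":
    # Check the MUX form against the conventional equations (3) and (4).
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                total, c_out = full_adder_mux(a, b, c)
                assert total == (a ^ b) ^ c
                assert c_out == (a & b) | (c & (a ^ b))
    print(ripple_carry_add(0xFFFFFFFF, 1))
```

In this form the carry input of each stage only passes through the final selection in (6), which mirrors the single GDI MUX on the carry path of the proposed cell.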
The 32-bit adder design was also simulated in 45 nm technology with and without buffers. The delay improved significantly when buffers were used. The simulation results are shown in Fig. 9.

$$N = \frac{0.2V_{\rm DD}}{V_{\rm drop}} \tag{10}$$

**Table 6** Normalised energy metric (in 32 and 22 nm) of the proposed design in comparison with the recent works

| Design | Technology, nm | Normalised energy, aJ |
|-----------------|----|--------|
| GDI [26] | 32 | 4.56 |
| GDI [26] | 22 | 4.104 |
| 16T-hybrid [18] | 32 | 2.034 |
| 16T-hybrid [18] | 22 | 1.83 |
| 10T [21] | 32 | 11.12 |
| 10T [21] | 22 | 10.008 |
| proposed | 32 | 0.871 |
| proposed | 22 | 0.784 |

Fig. 8 Simulation test bench of 32-bit carry propagation adder

Fig. 9 Performance (power and delay) of the 32-bit full adder using the proposed 1-bit full adder design

## 6 Conclusion

Here, the authors have developed a new full adder circuit, 14T-MVT-GDI. The simulations were carried out using Cadence tools in a 45 nm technology at a ULV of 0.2 V. The results are compared with other standard full adder designs such as CMOS, CPL, and TGA, and with the hybrid, 10T, and GDI designs reported in the literature. The approach of using low-$V_T$ cells in the threshold-drop paths and swing-restoring gates at the outputs (to improve the output swing and speed), along with the GDI methodology (to reduce the area), leads to significant improvements in the overall performance of the proposed design. Although the proposed design consumes more power than some of the designs, it still provides savings of >57% in energy and 92% in EDP in comparison with the other designs reported. The proposed design also provides full voltage swing with only 14 transistors. Monte-Carlo simulations show that the proposed design is robust against local and global process variations.
Normalised energy consumption based on the ITRS technology scaling trends also shows that the proposed design achieves >57% energy savings in comparison with the recent works. The proposed design was further extended to implement 32-bit adders with and without buffers at appropriate stages (after every two stages). Hence, the proposed full adder circuit could be used in most area- and energy-efficient computing applications.

## References

- [1] Alioto, M.: 'Enabling the internet of things from integrated circuits to integrated systems' (Springer, New York, 2017, 1st edn.)
- [2] Wang, A., Chandrakasan, A.: 'Sub-threshold design for ultra low-power systems' (Springer, New York, 2006, 1st edn.)
- [3] Sakthivel, R., Kittur, H.M.: 'Energy efficient low area error tolerant adder with higher accuracy', *Circ. Syst. Signal Process.*, 2014, **33**, (8), pp. 2625-
- [4] Rabaey, J.M., Chandrakasan, A., Nikolic, B.: 'Digital integrated circuits: a design perspective' (Prentice Hall, NJ, USA, 2003, 2nd edn.)
- [5] Weste, N.H., Harris, D.: 'CMOS VLSI design: a circuits and systems perspective' (Addison-Wesley, Boston, USA, 2010, 4th edn.)
- [6] Navi, K., Foroutan, V., Azghadi, M.R., *et al.*: 'A novel low-power full-adder cell with new technique in designing logical gates based on static CMOS inverter', *Microelectron. J.*, 2009, **40**, (10), pp. 1441–1448
- [7] Navi, K., Maeen, M., Foroutan, V., *et al.*: 'A novel low-power full-adder cell for low voltage', *Integr., VLSI J.*, 2009, **42**, (4), pp. 457–467
- [8] Radhakrishnan, D.: 'Low-voltage low-power CMOS full adder', *IEE Proc., Circuits Devices Syst.*, 2001, **148**, (1), pp. 19–24
- [9] Zimmermann, R., Fichtner, W.: 'Low-power logic styles: CMOS versus pass-transistor logic', *IEEE J. Solid-State Circuits*, 1997, **32**, (7), pp. 1079–1090
- [10] Alioto, M., Cataldo, G.D., Palumbo, G.: 'Mixed full adder topologies for high-performance low-power arithmetic circuits', *Microelectron. J.*, 2007, **38**, (1), pp. 130–139
- [11] Zhang, M., Gu, J., Chang, C.H.: 'A novel hybrid pass logic with static CMOS output drive full-adder cell'. Proc. IEEE ISCAS., Bangkok, Thailand, 2003,
- [12] Navi, K., Moaiyeri, M.H., Mirzaee, R.F., *et al.*: 'Two new low-power full adders based on majority-not gates', *Microelectron. J.*, 2009, **40**, (1), pp. 126-
- [13] Bui, H.T., Wang, A., Jiang, Y.: 'Design and analysis of low-power 10-transistor full adders using novel XOR-XNOR gates', *IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process.*, 2002, **49**, (1), pp. 25–30
- [14] Vesterbacka, M.: 'A 14-transistor CMOS full adder with full voltage-swing nodes'. Proc. IEEE Workshop Sig. Proces. Syst., Taipei, Taiwan, 1999, pp. 713–722
- [15] Goel, S., Kumar, A., Bayoumi, M.A.: 'Design of robust, energy-efficient full adders for deep-submicrometer design using hybrid-CMOS logic style', *IEEE Trans. VLSI Syst.*, 2006, **14**, (12), pp. 1309–1321
- [16] Tung, C.K., Hung, Y.C., Shieh, S.H., *et al.*: 'A low-power high-speed hybrid CMOS full adder for embedded system'. In Proc. IEEE DDECS., Krakow, Poland, 2007, pp. 1–4
- [17] Aguirre, H.M., Linares, A.M.: 'CMOS full-adders for energy-efficient arithmetic applications', *IEEE Trans. VLSI Syst.*, 2011, **19**, (4), pp. 718–721
- [18] Bhattacharyya, P., Kundu, B., Ghosh, S., *et al.*: 'Performance analysis of a low-power high-speed hybrid 1-bit full adder circuit', *IEEE Trans. VLSI Syst.*, 2015, **23**, (10), pp. 2001–2008
- [19] Shams, A.M., Darwish, T.K., Bayoumi, M.A.: 'Performance analysis of low-power 1-bit CMOS full adder cells', *IEEE Trans. VLSI Syst.*, 2002, **10**, (2), pp. 20–29
- [20] Shoba, M., Nakkeran, R.: 'GDI based full adders for energy efficient arithmetic applications', *Eng. Sci. Technol., Int. J.*, 2015, **19**, pp. 485–496
- [21] Dokania, V., Verma, R., Manisha, G., *et al.*: 'Design of 10T full adder cell for ultralow-power applications', *Ain Shams Eng. J.*, 2018, **9**, (4), pp. 2363–2372
- [22] Morgenshtein, A., Fish, A., Wagner, I.A.: 'Gate-diffusion input (GDI): a power-efficient method for digital combinatorial circuits', *IEEE Trans. VLSI Syst.*, 2002, **10**, (5), pp. 566–581
- [23] Morgenshtein, A., Shwartz, I., Fish, A.: 'Gate-diffusion input (GDI) logic in standard CMOS nanoscale process'. Proc. IEEE 26th Conv. of Electrical and Electronics Eng., Eilat, Israel, 2010, pp. 776–780
- [24] Narendra, S.G., Chandrakasan, A.: 'Leakage in nanometer CMOS technologies' (Springer, New York, USA, 2006, 1st edn.)
- [25] Kishore, S., Sakthivel, R.: 'Analysis of GDI logic for minimum energy optimal supply voltage'. Proc. Int. Conf. Microelectronic Dev., Cir., & Syst., Vellore, India, 2017, pp. 1–3
- [26] Lee, P.M., Hsu, C.H., Hung, Y.H.: 'Novel 10-T full adders realized by GDI structure'. In Proc. IEEE ISIC., Singapore, Singapore, 2007, pp. 115–118
- [27] International Technology Roadmap for Semiconductors (ITRS), 2007 Edition,
- [28] Rachel, C.: 'Transistors could stop shrinking in 2021', *IEEE Spectr.*, 2016, **53**, (9), pp. 9–11
- [29] International Roadmap for Devices and Systems (IRDS), 2017 Edition, https://irds.ieee.org/
- [30] Dreslinski, R.G., Wieckowski, M., Blaauw, D., *et al.*: 'Near-threshold computing: reclaiming Moore's law through energy efficient integrated circuits', *Proc. IEEE.*, 2010, **98**, (2), pp. 253–266
- [31] Zackriya, M.V., Kittur, H.M.: 'Content addressable memory-early predict and terminate precharge of match-line', *IEEE Trans. VLSI Syst.*, 2017, **25**, (1), pp.